Recent Topics in Speech Recognition Research at NTT Laboratories

Authors

  • Sadaoki Furui
  • Kiyohiro Shikano
  • Shoichi Matsunaga
  • Tatsuo Matsuoka
  • Satoshi Takahashi
  • Tomokazu Yamada
Abstract

This paper introduces three recent topics in speech recognition research at NTT (Nippon Telegraph and Telephone) Human Interface Laboratories. The first topic is a new HMM (hidden Markov model) technique that uses VQ-code bigrams to constrain the output probability distribution of the model according to the VQ-codes of previous frames. Because the output probability distribution changes depending on the previous frames even within the same state, this method reduces the overlap between the feature distributions of different phonemes. The second topic is a set of approaches for adapting a syllable trigram model to a new task in Japanese continuous speech recognition. An approach that uses the most recent input phrases for adaptation is effective in reducing perplexity and improving phrase recognition rates. The third topic is stochastic language models for sequences of Japanese characters, to be used in a Japanese dictation system with unlimited vocabulary. Japanese characters consist of Kanji (Chinese characters) and Kana (Japanese alphabets), and each Kanji has several readings depending on the context. Our dictation system uses character-trigram probabilities, obtained from a text database consisting of both Kanji and Kana, as a source model and generates Kanji-and-Kana sequences directly from input speech.
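The character-trigram source model mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the boundary symbols `^` and `$`, and the additive smoothing are all assumptions made for illustration, and the sketch covers only trigram estimation and perplexity, not decoding from speech.

```python
import math
from collections import Counter

def train_char_trigrams(text):
    """Count character trigrams and their two-character contexts."""
    padded = "^^" + text + "$"  # hypothetical start/end boundary symbols
    trigrams = Counter()
    contexts = Counter()
    for i in range(len(padded) - 2):
        trigrams[padded[i:i + 3]] += 1
        contexts[padded[i:i + 2]] += 1
    vocab = set(padded)
    return trigrams, contexts, vocab

def trigram_prob(char, context, trigrams, contexts, vocab, alpha=1.0):
    """P(char | two preceding characters), with additive smoothing (an assumption)."""
    numerator = trigrams[context + char] + alpha
    denominator = contexts[context] + alpha * len(vocab)
    return numerator / denominator

def perplexity(text, trigrams, contexts, vocab):
    """Per-character perplexity of text under the trigram model."""
    padded = "^^" + text + "$"
    log_sum = 0.0
    count = 0
    for i in range(2, len(padded)):
        p = trigram_prob(padded[i], padded[i - 2:i], trigrams, contexts, vocab)
        log_sum += math.log2(p)
        count += 1
    return 2 ** (-log_sum / count)

# Toy usage: text that resembles the training data gets lower perplexity.
tri, ctx, vocab = train_char_trigrams("ababababab")
print(perplexity("abab", tri, ctx, vocab) < perplexity("bbaa", tri, ctx, vocab))
```

The same counting scheme applies unchanged to mixed Kanji-and-Kana text, since Python strings are sequences of Unicode characters. A realistic model would need a far larger corpus and a more careful smoothing scheme than the one assumed here.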


Similar resources

Selected topics from 40 years of research on speech and speaker recognition

This paper summarizes my 40 years of research on speech and speaker recognition, focusing on selected topics that I have investigated at NTT Laboratories, Bell Laboratories and Tokyo Institute of Technology with my colleagues and students. These topics include: the importance of spectral dynamics in speech perception; speaker recognition methods using statistical features, cepstral features, an...


Research and Development of Robust Speech Recognition

This paper describes recent research and development activities on robust ASR (automatic speech recognition) in NTT Human Interface Laboratories. ASR system design has been changing from the experimental to the commercial level. A relevant issue in achieving practical ASR is robustness against environmental noise and speaker/circuit differences. Adaptation techniques have been widely investigat...


Session 5a: Acoustic Modeling

The session focused on acoustic modeling for speech recognition, which can be segmented into three broad sub-areas: (1) feature extraction, (2) modeling the features for the speech source, and (3) estimation of the model parameters. The papers in this session touched on all of these areas. Huang focuses on the feature representation. Furui et al., Austin et al., and Kimbal et al. discuss new m...


Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...


Broadcast Technology

Closed captioning to convey the speech of TV programs by text is becoming a useful means of providing information for elderly people and the hearing impaired, and real-time captioning of live programs is expanding yearly thanks to the use of speech recognition technology and special keyboards for high-speed input. This paper describes the current state of closed captioning, provides an overview...



Publication date: 1992